1  Introduction


Mature  engineering  fields  have  methods  of  construction  that  have  a  high  likelihood  of  success 
and  that  guarantee  the  proper  functioning  of  systems,  even  within  hostile  environments. 
These  methods  relate  behavior  to  structure  and  have  some  underlying  notion  of  composition 
related  to  the  implementation  domain.  Unfortunately,  the  construction  of  computer  systems 
has  not  yet  reached  the  same  level  of  maturity.  While  many  mathematical  theories  have 
been  developed,  they  have  not  yet  been  brought  into  standard  engineering  practice. 

Bridging  this  gap  between  theory  and  engineering  practice  requires  sound  and  pragmatic 
principles  of  construction  and  composition  for  software  systems.  Thus  there  are  at  least  two 
necessary  tasks:  identifying  these  principles,  and  investigating  their  suitability  for  problems 
of  real  engineering  interest.  Our  approach  is  to  adopt  existing  theories  and  technology 
where  possible  and  to  explore  how  they  can  be  applied  to  nontrivial  engineering  applications. 
In  particular,  we  focus  on  higher-order  logic,  category  theory,  and  algebraic  specifications, 
making significant use of the Higher Order Logic (HOL) theorem-prover [2] and Kestrel
Institute’s  SPECWARE  specification  composition  and  refinement  system  [3]. 

The  method  of  design  in  both  HOL  and  SPECWARE  is  to  construct  small  modules  that 
can  be  composed  and  verified.  The  well-documented  advantages  of  modularity  apply  here, 
as  modular  theories  will  be  more  reusable,  and  easier  to  build  and  verify.  HOL  theories 
are  organized  hierarchically,  so  that  new  theories  can  be  built  by  specialization  of  existing 
theories. Design in SPECWARE is a semi-automatic process, in which the designer creates
specifications and chooses composition or refinement methods, which are performed
automatically by the system. Again, the creation of small specifications is the preferred method.
The universal composition method, based on pushouts and colimits in category theory,
composes specifications in a canonical way. The refinement methods can create either C++ or
LISP code.

Our  overall  approach  is  to  build  HOL  theories  that  specify  the  desirable  properties  and 
invariants  that  characterize  the  task,  and  use  HOL’s  theorem-proving  capability  to  verify  the 
soundness  and  completeness  of  the  collection  of  theories.  These  theories  are  then  transformed 
into SPECWARE specifications, which are then refined into executable code. This approach
has been used to formally define and specify much of a secure electronic mail protocol, RFC
1421 (Privacy Enhanced Mail) [4]; these results have been reported elsewhere [8, 9, 10].

HOL theories and SPECWARE specifications are both higher-order theories, so the mapping
between them is fairly straightforward. However, there is a technical difficulty in the
refinement process, because there are many potential refinements of a SPECWARE specification.
Furthermore, not all refinements result in consistent specifications (i.e., specifications
that can be refined to meaningful and valid code). Ultimately, we would like to identify
explicit principles of construction that ensure the appropriate refinements and to explore the
applicability of these principles.

This report describes an important first step, namely the formulation in higher-order logic
of the primary concepts that underlie SPECWARE's refinement framework. Throughout this
report, we provide high-level, English-language explanations of the concepts, followed
by their implementation in the logic of the HOL theorem prover. Section 2 covers the most
basic definitions of category theory [6], the primary foundation for the rest of the mathematical
framework. The next three sections describe the foundations of algebraic specifications
[1]. Section 3 introduces signatures, which are (roughly speaking) high-level abstractions
that identify the basic data types and the basic operators of a system. Algebras, which
provide interpretations for these signatures, appear in Section 4. To restrict the possible
interpretations of a signature, it is necessary to introduce further constraints, leading to
specifications; these are discussed in Section 5. Finally, Section 6 describes possible future
work that carries our approach further.


2  Category-Theory  Basics 

In this section, we give an overview of the most basic definitions of category theory (including
the notion of category itself) necessary for understanding the mathematical framework
underlying SPECWARE. Our aim here is to provide English-language explanations of the
concepts as well as their formulations in the higher-order logic of the HOL theorem prover
[2]. In fact, we shall follow this approach throughout this report.

The  definitions  in  this  section  are  not  original,  either  in  their  English  forms  or  in  their 
HOL  forms.  For  example,  Pierce  provides  a  significantly  more  complete  introduction  to 
category  theory  [6].  The  HOL  formulations  we  give  in  this  section  are  due  to  Morris  [5] 
and  will  form  the  basis  of  our  own  formulations  in  subsequent  sections;  Agerholm  provides  a 
similar  embedding  of  category  theory  into  HOL,  choosing  a  different  representation  for  the 
categorical  arrows  [7]. 

A category $C$ comprises a collection $O_C$ of objects and a collection $A_C$ of arrows satisfying
the properties detailed below. We often refer to the elements of $O_C$ as $C$-objects and to the
elements of $A_C$ as $C$-arrows.

• Each arrow is associated with two objects called its domain and codomain. When $f$ is
an arrow whose domain and codomain are $A$ and $B$, respectively, we write $f : A \to B$.

• For each $C$-object $A$, there is an identity arrow $id_A : A \to A$.

• For each pair of $C$-arrows $f : A \to B$ and $g : B \to C$, there is a composite arrow
$g \circ f : A \to C$. The composition operator $\circ$ satisfies the following properties:

Identity For any $C$-arrow $f : A \to B$,

$$f \circ id_A = f \quad\text{and}\quad id_B \circ f = f$$

Associativity For any $C$-arrows $f : A \to B$, $g : B \to C$, and $h : C \to D$,

$$h \circ (g \circ f) = (h \circ g) \circ f$$

For example, the category Set is a category of sets (as objects) and total functions
between  sets  (as  arrows).  The  identity  arrow  is  the  identity  function,  and  the  composition 
in  the  category  is  the  standard  function  composition. 

In HOL, a category $C$ can be represented as a four-tuple of functions with certain properties.
A pre-category is a four-tuple

(O, A, Id, oo) : (α → bool) ×
                 (α × γ × α → bool) ×
                 (α → α × γ × α) ×
                 (α × γ × α → α × γ × α → α × γ × α)

satisfying  the  following  constraints: 

• The function O picks out $C$-objects from pre-objects of HOL type α.

• The function A picks out $C$-arrows from pre-arrows of type α × γ × α.

• The function Id constructs an identity arrow for each $C$-object.

• The function oo constructs a composite arrow oo f g for two $C$-arrows f and g, provided that
they are composable (i.e., when the domain of f is the codomain of g).

As a technical aside, we point out that the name oo is used for the categorical composition
operator to avoid confusion with HOL's built-in composition operator o.

Each arrow in the category is represented by a triple (d, f, c) of type α × γ × α, where
d, f and c correspond to the arrow's domain, the arrow itself, and its codomain, respectively.
The accessor functions dom and cod return the domain and codomain of a given arrow. The
property composable asserts that two arrows of a given category are composable, and the
property cpsl asserts that two triples are arrows of a certain category and that they are
composable. These properties are summarized as follows:

dom         ⊢def ∀d m c. dom (d,m,c) = d
cod         ⊢def ∀d m c. cod (d,m,c) = c
composable  ⊢def ∀f g. composable f g = (dom f = cod g)
cpsl        ⊢def ∀A f g. cpsl A f g = A f ∧ A g ∧ composable f g

In Morris's treatment of category theory, whenever we are concerned only with a function's
behavior over a certain domain, the behavior of the function outside the domain is
forced to be the value ARB. The value ARB is based on the Hilbert operator ε, and
ARB of any type τ is the term εx:τ. T. Two definitions exploit this idea. The predicate
isRestr checks that the value of function f outside the truth-set of predicate P is ARB, and
isRestr2 checks that the value of a curried function of two arguments g outside the truth-set
of predicate Q is ARB.

ARB       = εx:*. T
isRestr   ⊢def ∀P f. isRestr P f = f = (λx ::P. f x)
isRestr2  ⊢def ∀Q g. isRestr2 Q g = g = (λx. λy ::(Q x). g x y)
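
The restriction idea can be illustrated outside HOL with a small Python sketch. It is only an analogy under stated assumptions: the sentinel object ARB stands in for HOL's arbitrary value, the names restrict and is_restr are hypothetical helpers, and equality of functions is checked pointwise over a finite universe rather than proved.

```python
ARB = object()  # assumption: a sentinel standing in for HOL's arbitrary value


def restrict(pred, f):
    """Agree with f where pred holds and return ARB everywhere else."""
    return lambda x: f(x) if pred(x) else ARB


def is_restr(pred, f, universe):
    """Finite-universe analogue of isRestr: f equals its own restriction."""
    g = restrict(pred, f)
    return all(f(x) == g(x) for x in universe)


def is_even(n):
    return n % 2 == 0


def square(n):
    return n * n
```

Here square is defined on every integer, so it is not restricted to the even numbers, whereas restrict(is_even, square) is.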
Using these definitions, the predicate isCat (given in Figure 1) checks whether a given
four-tuple  is  indeed  a  category.  It  is  straightforward  to  see  that  this  HOL  formulation 
captures  all  of  the  important  aspects  of  the  definition  of  category. 


isCat
  ⊢def ∀O A id oo.
    isCat (O,A,id,oo) =
      (∀f ::A. O (dom f) ∧ O (cod f)) ∧
      isRestr2 (cpsl A) oo ∧
      (∀f g ::A. composable f g ⊃ A (oo f g)) ∧
      (∀f g ::A.
         composable f g ⊃
         (dom (oo f g) = dom g) ∧ (cod (oo f g) = cod f)) ∧
      (∀f g h ::A.
         composable f g ∧ composable g h ⊃
         (oo (oo f g) h = oo f (oo g h))) ∧
      isRestr O id ∧
      (∀a ::O. A (id a)) ∧
      (∀a ::O. (dom (id a) = a) ∧ (cod (id a) = a)) ∧
      (∀a ::O.
         (∀f ::A. (dom f = a) ⊃ (oo f (id a) = f)) ∧
         (∀g ::A. (cod g = a) ⊃ (oo (id a) g = g)))


Figure  1:  The  predicate  isCat. 
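
To make the conditions of Figure 1 concrete, the following Python sketch checks the analogous laws for a finite pre-category. It is an illustration, not part of the HOL development: the function is_category and the two-object example are assumptions introduced here, and the isRestr and isRestr2 conditions (which concern behavior outside the domain of interest) are not modeled.

```python
from itertools import product


def is_category(objects, arrows, ident, compose):
    """Check the category laws for a finite pre-category.

    Arrows are (dom, name, cod) triples. compose(f, g) means "f after g"
    and is only consulted when dom(f) == cod(g).
    """
    dom = lambda a: a[0]
    cod = lambda a: a[2]
    composable = lambda f, g: dom(f) == cod(g)

    # Every arrow starts and ends at an object.
    if not all(dom(f) in objects and cod(f) in objects for f in arrows):
        return False
    # Identities are arrows with the right endpoints.
    for a in objects:
        i = ident(a)
        if i not in arrows or dom(i) != a or cod(i) != a:
            return False
    # Composites are arrows with the right endpoints.
    for f, g in product(arrows, repeat=2):
        if composable(f, g):
            fg = compose(f, g)
            if fg not in arrows or dom(fg) != dom(g) or cod(fg) != cod(f):
                return False
    # Composition is associative.
    for f, g, h in product(arrows, repeat=3):
        if composable(f, g) and composable(g, h):
            if compose(compose(f, g), h) != compose(f, compose(g, h)):
                return False
    # Identities are units for composition.
    for f in arrows:
        if compose(f, ident(dom(f))) != f or compose(ident(cod(f)), f) != f:
            return False
    return True


# Hypothetical example: two objects and a single non-identity arrow u : A -> B.
OBJECTS = {"A", "B"}
ARROWS = {("A", "id", "A"), ("B", "id", "B"), ("A", "u", "B")}


def ident(a):
    return (a, "id", a)


def compose(f, g):
    # In this tiny category at least one factor is always an identity.
    return f if g[1] == "id" else g
```

Removing the identity arrow on B from ARROWS makes the check fail, which mirrors the requirement that id a be an arrow for every object a.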

Any four-tuple that satisfies isCat defines a category. A compound type (α, γ) cat is
defined using a type-definition construct of HOL, where α is the type of pre-objects and
(α × γ × α) is the type of pre-arrows.

cat_TY_DEF ⊢def ∃rep. TYPE_DEFINITION isCat rep


3  Signatures 

Having  provided  a  HOL  formulation  of  categories  in  the  last  section,  we  now  turn  our 
attention  to  some  particular  categories  of  importance  to  the  development  of  assured  code  via 
algebraic specifications. In this section, we consider signatures, which introduce a collection
of data types and operations on those types. Signatures are purely syntactic entities and
have no meanings in and of themselves; in Section 4, we will examine algebras, which provide
meanings for these signatures.

A signature $\Sigma$ is a pair $(S, \Omega)$, where $S$ is a set of sorts (intuitively, base types) and $\Omega$ is a
set of function symbols (also called operators). Each function symbol $\rho$ in $\Omega$ has an associated
type $(s_1 \times s_2 \times s_3 \times \cdots \times s_n) \to s_0$ for some $n \geq 0$, with each $s_i$ (for $i \in \{0, 1, \ldots, n\}$) a
member of $S$; such an operator is said to have arity $n$. A function symbol with type $\to s$ is
called a constant and is said to have type $s$.
$\Sigma_{tl}$:

  $S_{tl}$ (sorts):
    color

  $\Omega_{tl}$ (function symbols):
    green : color
    yellow : color
    red : color
    changeColor : color → color

$\Sigma_{bp}$:

  $S_{bp}$ (sorts):
    boolPair

  $\Omega_{bp}$ (function symbols):
    TT : boolPair
    TF : boolPair
    FT : boolPair
    FF : boolPair
    cycle : boolPair → boolPair


Figure  2:  Sample  signatures. 


For example, Figure 2 contains two signatures, $\Sigma_{tl}$ (for traffic lights) and $\Sigma_{bp}$ (for boolean
pairs). The traffic-light signature $\Sigma_{tl}$ has a sort color to represent the colors of a traffic light,
and three operators (i.e., green, red, and yellow) of type color to represent the three possible
colors of a traffic light. It also has an operator changeColor that changes the color of the
traffic light. (To be pedantic here, our intention is that changeColor will eventually have a
certain behavior. However, because we are currently at a purely syntactic level, we are only
indicating that there will be some operator with this name.)

Likewise, the boolean-pair signature $\Sigma_{bp}$ has a sort boolPair to represent boolean pairs and four
operators (i.e., TT, TF, FT, FF) of type boolPair intended to represent the four possible
combinations for a boolean pair. It also has an operator cycle intended to cycle through the
various boolean pairs in a particular sequence.


3.1  Signatures  as  a  HOL  Type 

We  define  sorts  and  function  symbols  to  be  of  base  types  sort  and  operator  in  HOL.  The  use 
of  new  base  types  gives  us  as  much  generality  as  with  a  type  variable,  because  a  signature 
is  just  a  set  of  symbols  with  special  properties. 

For a signature $\Sigma = (S, \Omega)$, $S$ is represented in HOL as a set whose elements are of type
sort. Based on the observation that every function symbol has an input type and an output
type, $\Omega$ is represented in HOL as a set of triples (ρ, sl, s) : operator × sort list × sort, where ρ
is a function symbol, sl is the type of the input arguments to ρ, and s is the return type of ρ. The
function symbol's input type is a list of sorts. A constant c in $\Omega$ with type s is represented
by a triple (c, [ ], s), where [ ] is HOL's representation of the empty list. Using triples to
represent the elements of the set $\Omega$ allows us to overload function symbols. Elements (which
we shall call function names) in $\Omega$ can be treated as primitives. We define accessor functions

rho : operator × sort list × sort → operator,
arg : operator × sort list × sort → sort list,
ret : operator × sort list × sort → sort

to obtain the function symbol, the argument type, and the return type of a given function
name:

rho  ⊢def ∀op v s. rho (op,v,s) = op
arg  ⊢def ∀op v s. arg (op,v,s) = v
ret  ⊢def ∀op v s. ret (op,v,s) = s

Because we use sets extensively, a predicate inSet is defined to test membership in a set.

inSet  ⊢def ∀s e. inSet s e = e IN s

A pair $(S, \Omega)$ represents a signature if the input and output types of all function-names
in $\Omega$ are restricted to the sorts in $S$. The predicate

rhoVSRes : sort set → operator × sort list × sort → bool

tests whether the input and output types of a function-name (op, sl, s) are restricted to the
sorts in $S$. The predicate isSig : sort set × (operator × sort list × sort) set → bool then
defines the subset of pairs that are valid representations of signatures.

rhoVSRes
  ⊢def ∀Ss op sl s. rhoVSRes Ss (op,sl,s) = EVERY (inSet Ss) sl ∧ s IN Ss
isSig
  ⊢def ∀Ss Omega. isSig (Ss,Omega) = (∀r ::(inSet Omega). rhoVSRes Ss r)
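
As an informal cross-check of these two predicates, the following Python sketch mirrors them on ordinary sets and tuples. The names rho_vs_res and is_sig and the encoding of function names as (op, argument sorts, return sort) tuples are assumptions made for this illustration; the data is the traffic-light signature of Figure 2.

```python
def rho_vs_res(sorts, fname):
    """True when the argument and return sorts of (op, sl, s) all lie in sorts."""
    _op, arg_sorts, ret_sort = fname
    return all(s in sorts for s in arg_sorts) and ret_sort in sorts


def is_sig(sorts, omega):
    """True when every function name in omega mentions only sorts from sorts."""
    return all(rho_vs_res(sorts, r) for r in omega)


# Traffic-light signature of Figure 2; constants have an empty argument list.
S_TL = {"color"}
OMEGA_TL = {
    ("green", (), "color"),
    ("yellow", (), "color"),
    ("red", (), "color"),
    ("changeColor", ("color",), "color"),
}
```

Adding a function name that mentions the sort boolPair to OMEGA_TL breaks the condition, because boolPair is not a member of S_TL.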

Finally, we define a new type sig for signatures using HOL's type-definition construct.
Accessor functions are defined in HOL to pick out from a signature its set of sorts $S$ and its set
of function names $\Omega$. We also define tester functions that, given a signature, check whether
a sort is in the set of sorts and whether a function-name is in the set of function-names $\Omega$:

sig_TY_DEF  ⊢def ∃rep. TYPE_DEFINITION isSig rep
sortsSig    ⊢def ∀x. sortsSig x = FST (REP_sig x)
omegaSig    ⊢def ∀x. omegaSig x = SND (REP_sig x)
inSorts     ⊢def ∀x. inSorts x = inSet (sortsSig x)
inOmega     ⊢def ∀x. inOmega x = inSet (omegaSig x)
3.2  Signatures as a Category

A signature morphism $f$ between two signatures $\Sigma = (S, \Omega)$ and $\Sigma' = (S', \Omega')$ is a pair
of functions $f_S : S \to S'$ and $f_\Omega : \Omega \to \Omega'$ such that the mapping $f_\Omega$ between function
symbols respects the mapping $f_S$ between sorts. That is, if a function symbol $\rho \in \Omega$ has
type $(s_1 \times s_2 \times s_3 \times \cdots \times s_n) \to s_0$, then $f_\Omega(\rho)$ is a function symbol of $\Omega'$ having type

$$(f_S(s_1) \times f_S(s_2) \times \cdots \times f_S(s_n)) \to f_S(s_0).$$

For example, recall the signatures of Figure 2. If the sort mapping $f_S : S_{tl} \to S_{bp}$ maps
the traffic light's color sort to the boolean pair's boolPair sort, then $f_\Omega : \Omega_{tl} \to \Omega_{bp}$ should
map green to a function symbol in $\Omega_{bp}$ that has type boolPair. Thus $f_\Omega$ can map green to
any of the four operators of type boolPair (TT, TF, FT, FF) in $\Omega_{bp}$, but it cannot map
green to cycle.
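
The green-versus-cycle example can be checked mechanically. The Python sketch below is a hedged illustration: is_sig_morphism, the dictionary encoding of the two mappings, and the particular choice of images for yellow, red, and changeColor are assumptions of this example rather than definitions from the report.

```python
def is_sig_morphism(dom_sig, cod_sig, f_s, f_o):
    """Check that (f_s, f_o) is a signature morphism from dom_sig to cod_sig.

    A signature is a pair (sorts, omega), where omega holds
    (op, arg_sorts, ret_sort) triples. f_s and f_o are dicts defined on the
    sorts and on the function names of the domain signature.
    """
    (d_sorts, d_omega), (c_sorts, c_omega) = dom_sig, cod_sig
    # Every sort must map to a sort of the codomain signature.
    if any(f_s.get(s) not in c_sorts for s in d_sorts):
        return False
    for r1 in d_omega:
        _op, args, ret = r1
        r2 = f_o.get(r1)
        if r2 not in c_omega:
            return False
        # The image must carry the translated argument and return sorts.
        if r2[1] != tuple(f_s[s] for s in args) or r2[2] != f_s[ret]:
            return False
    return True


SIG_TL = ({"color"}, {
    ("green", (), "color"), ("yellow", (), "color"), ("red", (), "color"),
    ("changeColor", ("color",), "color"),
})
SIG_BP = ({"boolPair"}, {
    ("TT", (), "boolPair"), ("TF", (), "boolPair"),
    ("FT", (), "boolPair"), ("FF", (), "boolPair"),
    ("cycle", ("boolPair",), "boolPair"),
})
F_S = {"color": "boolPair"}
F_O = {
    ("green", (), "color"): ("TT", (), "boolPair"),
    ("yellow", (), "color"): ("TF", (), "boolPair"),
    ("red", (), "color"): ("FT", (), "boolPair"),
    ("changeColor", ("color",), "color"): ("cycle", ("boolPair",), "boolPair"),
}
```

Redirecting green to cycle produces a mapping that the check rejects, since cycle takes one argument while green takes none.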

We  can  now  define  the  category  Sig  whose  objects  are  signatures  and  whose  arrows 
are  signature  morphisms.  In  HOL,  we  represent  an  arrow  in  the  category  Sig  by  a  triple 
(d, (fs, fo), c) having the following type:

sig ×
((sort → sort) ×
 ((operator × sort list × sort) → (operator × sort list × sort))) ×
sig

In this representation, d and c are the signatures that serve as domain and codomain of the
arrow, and (fs, fo) is the signature morphism itself.

We first define two accessor functions that retrieve the sort-mapping and operator-mapping
components of an arrow.

sigMFs_DEF  ⊢def ∀d fs fo c. sigMFs (d,(fs,fo),c) = fs
sigMFo_DEF  ⊢def ∀d fs fo c. sigMFo (d,(fs,fo),c) = fo

We then define a predicate sigA, which identifies the signature morphisms from pre-arrow
triples (d, (fs, fo), c). In particular, it ensures that fs and fo are indeed functions from the
sorts and operators of d to the sorts and operators of c; it also checks that the function-name
mapping fo respects the sort mapping fs.

sigA_DEF
  ⊢def sigA =
    (let SA m =
       isRestr (inSorts (dom m)) (sigMFs m) ∧
       isRestr (inOmega (dom m)) (sigMFo m) ∧
       (∀sn :: (inSorts (dom m)). inSorts (cod m) (sigMFs m sn)) ∧
       (∀r1 :: (inOmega (dom m)).
          let (op,v,s) = r1 and r2 = sigMFo m r1
          in
          (r2 = (rho r2,MAP (sigMFs m) v,sigMFs m s)) ∧
          inOmega (cod m) r2)
     in
     SA)
The identity arrow for a signature $(S, \Omega)$ is a pair of identity functions $id_S : S \to S$ and
$id_\Omega : \Omega \to \Omega$. In HOL it is defined as a function sigId, as follows.

sigId
  ⊢def sigId = (λs. (s, ((λs ::(inSorts s). s), (λr ::(inOmega s). r)), s))

The composition of two signature morphisms m and n is defined componentwise, so that
their sort-mapping functions are composed and their operator-mapping functions are composed.
The HOL definition of this composition operation sigOo is as follows.

sigOo
  ⊢def sigOo =
    (λm.
       λn ::(cpsl sigA m).
         dom n,
         ((λx :: (inSorts (dom n)). (sigMFs m o sigMFs n) x),
          (λx :: (inOmega (dom n)). (sigMFo m o sigMFo n) x)),
         cod m)

These definitions together allow us to prove that the four-tuple (λs. T, sigA, sigId, sigOo)
is indeed a category (which we call sigCat), as evidenced by the following HOL theorem.

sigCat_rep_IS_CAT
  [oracles: #] [axioms: ] [] ⊢ isCat ((λs. T), sigA, sigId, sigOo)
sigCat
  ⊢def sigCat = ABS_cat ((λs. T), sigA, sigId, sigOo)


4  Algebras 

Algebras  give  meaning  to  signatures.  For  any  given  signature,  there  are  many  potential 
algebras,  each  providing  a  different  interpretation  of  the  sorts  and  function  symbols.  For 
a fixed signature $\Sigma = (S, \Omega)$, we can talk about its collection of models; these models are
called $\Sigma$-algebras.

Each $\Sigma$-algebra is a pair $(A, I)$, where $A = \{A_s \mid s \in S\}$ is an $S$-indexed set of carriers,
and $I = \{I_\rho \mid \rho \in \Omega\}$ is an $\Omega$-indexed set of functions. Furthermore, we require the
functions $I_\rho$ to respect the typings of each $\rho$: if the function symbol $\rho$ is assigned the type
$s_1 \times \cdots \times s_n \to s$, then $I_\rho$ must be a function of type $(A_{s_1} \times A_{s_2} \times \cdots \times A_{s_n}) \to A_s$. Intuitively,
each carrier $A_s$ provides a set of values corresponding to the sort $s$. In turn, each function $I_\rho$
provides an interpretation of the function symbol $\rho$ as a function over the appropriate sets
of data values.

For example, recall the boolean-pair signature $\Sigma_{bp}$ from Figure 2. One possible $\Sigma_{bp}$-algebra
is $(\{A_{boolPair}\}, \{I_{TT}, I_{TF}, I_{FT}, I_{FF}, I_{cycle}\})$, where:

$$
\begin{aligned}
A_{boolPair} &= \{(T,T), (T,F), (F,T), (F,F)\} \\
I_{TT} &= (T,T) \\
I_{TF} &= (T,F) \\
I_{FT} &= (F,T) \\
I_{FF} &= (F,F) \\
I_{cycle} &= \lambda(x : A_{boolPair}).\ \text{if } x = (F,F) \text{ then } (F,T) \\
&\qquad \text{else if } x = (F,T) \text{ then } (F,F) \\
&\qquad \text{else if } x = (T,F) \text{ then } (T,T) \\
&\qquad \text{else } (F,F)
\end{aligned}
$$


Note that while this algebra reflects our probable intended interpretation of the operators
TT, TF, FT, and FF, this interpretation is not the only one. For example, consider the
(rather artificial) algebra $(\{B_{boolPair}\}, \{J_{TT}, J_{TF}, J_{FT}, J_{FF}, J_{cycle}\})$, where $B_{boolPair} = \mathbb{N}$ (the
set of natural numbers) and the $J_\rho$ functions are defined as follows:

$$
\begin{aligned}
J_{TT} &= 0 \\
J_{TF} &= 5 \\
J_{FT} &= 5 \\
J_{FF} &= 13 \\
J_{cycle} &= \lambda(x : \mathbb{N}).\ \text{if } x < 2 \text{ then } 0 \text{ else } 3
\end{aligned}
$$

This algebra meets all the necessary requirements: each constant operator of sort boolPair
is mapped to a value from the set $B_{boolPair}$, and the operator cycle is mapped to a function
of type $B_{boolPair} \to B_{boolPair}$. In particular, it is not necessary for the values of $B_{boolPair}$ to
resemble boolean pairs or even to be in one-to-one correspondence with the constants of the
sort boolPair. Similarly, it is acceptable for different constants to be mapped to the same value,
and the function $J_{cycle}$ can return values that are not necessarily assigned to a particular
constant.
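
The claim that this artificial algebra is well-typed can be spot-checked with a short Python sketch. This is an illustration under assumptions: respects_typing is a hypothetical helper, carriers are given as membership tests because the natural numbers are infinite, constants are modeled as zero-argument functions, and the check only samples the arguments 0 to 19 rather than proving anything.

```python
from itertools import product


def respects_typing(omega, carrier, interp, samples):
    """Check an interpretation against a signature on sample arguments.

    carrier maps a sort to a membership test, interp maps an operator name to
    a Python function, and samples maps a sort to values from its carrier.
    """
    for op, args, ret in omega:
        for xs in product(*(samples[s] for s in args)):
            if not carrier[ret](interp[op](*xs)):
                return False
    return True


OMEGA_BP = [
    ("TT", (), "boolPair"), ("TF", (), "boolPair"),
    ("FT", (), "boolPair"), ("FF", (), "boolPair"),
    ("cycle", ("boolPair",), "boolPair"),
]
# The carrier is the set of natural numbers.
CARRIER_J = {"boolPair": lambda v: isinstance(v, int) and v >= 0}
INTERP_J = {
    "TT": lambda: 0,
    "TF": lambda: 5,
    "FT": lambda: 5,
    "FF": lambda: 13,
    "cycle": lambda x: 0 if x < 2 else 3,
}
SAMPLES = {"boolPair": range(20)}
```

Replacing the interpretation of cycle by a function that can leave the natural numbers (for example, subtracting one) makes the check fail on the sample 0.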

In  Section  5,  we  will  discuss  how  signatures  can  be  augmented  with  additional  formulas 
or  equations  that  rule  out  certain  undesirable  algebras  (such  as  this  one,  perhaps).  For  now, 
however,  we  focus  on  the  HOL  implementation  of  algebras. 


4.1  Algebras  as  a  HOL  Type 

We introduce a new base-type value in HOL to represent the values that are in a carrier set.
In HOL, the set of sort-indexed carrier sets of a $\Sigma$-algebra is represented as a set of pairs
(v, s) : value × sort, where v is a value and s is the type of the value. This representation
allows us to overload symbols for values. Similarly, the set of functions of a $\Sigma$-algebra is
represented as a set of functions, each taking a list of values to a value.

We use triples of form $(\Sigma, A, I)$ to represent $\Sigma$-algebras in HOL: in each case, A is a
HOL function that constructs a carrier set from a sort of the signature $\Sigma$, and I is a HOL
function that maps a function-name to a function. Thus A and I have the following types
in HOL:


A : sort → value set
I : (operator × sort list × sort) → (value list → value)

We  call  A  the  carrier-sets  assignment  function  and  I  the  function-name  assignment  function. 

The HOL function carrierVals computes the carrier set of a $\Sigma$-algebra, given the carrier-sets
assignment function A.

carrierVals
  ⊢def ∀A Ss. carrierVals A Ss = {(s,v) | inSet Ss s ∧ inSet (A s) v}

Likewise, the function setOfVLists constructs a set from the carrier-sets assignment function
A and a sort list sl such that every element of the set is a value list, and each value in
the list has the type defined by the corresponding element in the sort list sl.

setOfVLists
  ⊢def ∀A v. setOfVLists A v = {xv | AND_EL (MAP2 inSet (MAP A v) xv)}


A triple $(\Sigma, A, I)$ is a $\Sigma$-algebra if the mapping from function-names to functions is
consistent with the mapping from sorts to carrier sets. In HOL, this property is defined as
a predicate isAlg:

isAlg
  ⊢def ∀sigma A Is.
    isAlg (sigma,A,Is) =
      isRestr (inSorts sigma) A ∧
      isRestr (inOmega sigma) Is ∧
      (∀rvs :: (inOmega sigma).
         isRestr (inSet (setOfVLists A (arg rvs))) (Is rvs) ∧
         (∀xv :: (inSet (setOfVLists A (arg rvs))).
            inSet (A (ret rvs)) (Is rvs xv)))

This  predicate  isAlg  identifies  the  subset  of  triples  that  are  valid  representations  of 
algebras.  Using  HOL’s  type-definition  construct  with  this  predicate,  we  define  a  new  type 
alg  for  algebras: 

alg_TY_DEF
  ⊢def ∃rep. TYPE_DEFINITION isAlg rep
alg_ISO_DEF
  ⊢def (∀a. ABS_alg (REP_alg a) = a) ∧
       (∀r. isAlg r = REP_alg (ABS_alg r) = r)
Finally, we define accessor functions sigAlg, carrierAlg, and funsAlg that extract the signature,
the carrier sets, and the interpretation of function names from an algebra.

sigAlg      ⊢def ∀x. sigAlg x = FST (REP_alg x)
carrierAlg  ⊢def ∀x. carrierAlg x = FST (SND (REP_alg x))
funsAlg     ⊢def ∀x. funsAlg x = SND (SND (REP_alg x))


4.2  $\Sigma$-Algebras as a Category

A $\Sigma$-homomorphism between two $\Sigma$-algebras $(A, I)$ and $(A', I')$ is a collection of functions
$h_s : A_s \to A'_s$ such that, for a $\Sigma$ operator $\rho$ with type $s_1 \times \cdots \times s_n \to s$,

$$h_s(I_\rho(v_1, v_2, \ldots, v_n)) = I'_\rho(h_{s_1}(v_1), h_{s_2}(v_2), \ldots, h_{s_n}(v_n)).$$

Intuitively, this equation says that $\Sigma$-homomorphisms preserve the algebraic structure:
applying the $I$-interpretation of the operator $\rho$ to appropriate values $v_1, \ldots, v_n$ and then
translating that result to a value in $A'_s$ yields the same result as first translating each of the
values $v_i$ to elements of $A'_{s_i}$ and then applying the $I'$-interpretation of $\rho$ to those values.

For any signature $\Sigma$, $Alg_\Sigma$ is the category whose objects are $\Sigma$-algebras and whose arrows
are $\Sigma$-homomorphisms. The identity arrows are simply those homomorphisms for which each
$h_s$ is the identity function, and composition is standard function composition (it is easy to
verify that the composition of two homomorphisms is indeed a homomorphism).
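
The homomorphism condition can be tested exhaustively when the source carriers are finite. The Python sketch below does so for two hypothetical algebras of the traffic-light signature of Figure 2: one whose carrier is the three color names and one whose carrier is the integers 0, 1, and 2. Both algebras, the map H, and the helper is_homomorphism are assumptions introduced only for this illustration.

```python
from itertools import product


def is_homomorphism(omega, h, interp_a, interp_b, carrier_a):
    """Check h(I_op(v1..vn)) == I'_op(h(v1)..h(vn)) for all ops and arguments.

    h maps a sort to a function between carriers. carrier_a maps a sort to a
    finite list of values, so the check is exhaustive.
    """
    for op, args, ret in omega:
        for vs in product(*(carrier_a[s] for s in args)):
            lhs = h[ret](interp_a[op](*vs))
            rhs = interp_b[op](*(h[s](v) for s, v in zip(args, vs)))
            if lhs != rhs:
                return False
    return True


OMEGA_TL = [
    ("green", (), "color"), ("yellow", (), "color"), ("red", (), "color"),
    ("changeColor", ("color",), "color"),
]
COLORS = ["green", "yellow", "red"]
CARRIER_A = {"color": COLORS}
NEXT = {"green": "yellow", "yellow": "red", "red": "green"}
INTERP_A = {
    "green": lambda: "green", "yellow": lambda: "yellow", "red": lambda: "red",
    "changeColor": lambda c: NEXT[c],
}
INTERP_B = {
    "green": lambda: 0, "yellow": lambda: 1, "red": lambda: 2,
    "changeColor": lambda n: (n + 1) % 3,
}
# Candidate homomorphism: send each color to its position in COLORS.
H = {"color": lambda c: COLORS.index(c)}
```

A constant map such as sending every color to 0 fails the condition already on the constant yellow, whose two interpretations would then disagree.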

In HOL, we define the category $Alg_\Sigma$ in such a way that the signature $\Sigma$ is taken as a
parameter. The predicate isSigmaAlg selects $\Sigma$-algebras from the set of all algebras.

isSigmaAlg
  ⊢def ∀sigma. isSigmaAlg sigma = (let P a = (sigAlg a = sigma) in P)

In HOL, an arrow in the category $Alg_\Sigma$ is represented as a triple

(m, h, n) : alg × (sort → value → value) × alg,

where m and n are the domain and codomain $\Sigma$-algebras of the arrow and h : sort →
value → value is a function that maps the elements of m's carrier sets to elements of the
corresponding carrier sets of n. The predicate algHom defines a homomorphism between
two algebras that have the same signature. The predicate sigmaAlgA picks out $Alg_\Sigma$ arrows
from pre-arrows with the help of algHom.
The identity arrow in the category $Alg_\Sigma$ is the function (λs. I : value → value). The
composition of arrows (a, f, b) and (b, g, c) is defined to be (a, λs. ((g s) o (f s)), c).

The four-tuple algCat that represents the $Alg_\Sigma$-category is defined as follows.


Proving  that  algCat  represents  a  category  amounts  to  proving  the  following  goal: 
Due to time limitations, we have not yet performed this proof within HOL. However, based
on  our  prior  experience  with  HOL  and  our  knowledge  that  the  underlying  category  theory 
is  correct,  we  are  confident  that  this  goal  could  be  proven  with  HOL  without  significant 
difficulties. 

4.3  Algebraic  Terms 

Algebraic specifications describe abstract data types (ADTs) by augmenting signatures with
descriptions of the characteristic properties of the ADTs. These properties of ADTs can be
expressed as formulas (such as equations) on the terms of $\Sigma$-algebras.

To define the algebraic terms associated with a given signature $\Sigma = (S, \Omega)$, we begin
with an infinite set $V$ of symbols called variables, assumed to be distinct from all the sorts
and operator symbols in $\Sigma$. A sort assignment $\Gamma$ is a finite set of pairs $(x, s)$, where $x \in V$
is a variable and $s \in S$ is a sort; $\Gamma$ must be consistent, in that it may associate at most
one sort with any particular variable. We then can define by mutual induction a family of
sets $Terms(\Sigma, \Gamma) = \{Terms^s(\Sigma, \Gamma) \mid s \in S\}$ as follows, where each set $Terms^s(\Sigma, \Gamma)$ is the
collection of $\Sigma$-terms of sort $s$ under the sort assignment $\Gamma$:

• If $(x, s)$ is in $\Gamma$, then $x$ is in $Terms^s(\Sigma, \Gamma)$.

• If, for each $i \in \{1, 2, \ldots, n\}$, $t_i$ is a term in $Terms^{s_i}(\Sigma, \Gamma)$, and if $\rho$ is a function symbol
of type $(s_1 \times s_2 \times \cdots \times s_n \to s)$, then $\rho(t_1, t_2, \ldots, t_n)$ is a term in the set $Terms^s(\Sigma, \Gamma)$.

A $\Sigma$-equation with respect to the sort assignment $\Gamma$ is a pair of terms $(t_1, t_2)$ such that $t_1$
and $t_2$ are both elements of $Terms^s(\Sigma, \Gamma)$, for some sort $s$. Such an equation is conventionally
written as: $t_1 =_s t_2\ [\Gamma]$.
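
The inductive definition of terms translates directly into a recursive sort checker. The Python sketch below is a simplified illustration, with several assumptions: variables are strings, an application is a tuple (op, t1, ..., tn), the sort assignment is a dictionary (which is consistent by construction), and each operator name has a single type, so overloading is not modeled. The names sort_of and is_equation are hypothetical.

```python
def sort_of(term, omega, gamma):
    """Return the sort of a term, or None if the term is ill-formed.

    A variable is a string looked up in gamma (a dict from variables to
    sorts). An application is a tuple (op, t1, ..., tn), with omega mapping
    each operator to its (argument sorts, result sort).
    """
    if isinstance(term, str):
        return gamma.get(term)
    op, *subterms = term
    if op not in omega:
        return None
    arg_sorts, ret_sort = omega[op]
    if len(subterms) != len(arg_sorts):
        return None
    if any(sort_of(t, omega, gamma) != s for t, s in zip(subterms, arg_sorts)):
        return None
    return ret_sort


def is_equation(t1, t2, omega, gamma):
    """Both sides must be well-formed terms of the same sort."""
    s1 = sort_of(t1, omega, gamma)
    return s1 is not None and s1 == sort_of(t2, omega, gamma)


# Boolean-pair signature of Figure 2, with one variable x of sort boolPair.
OMEGA_BP = {
    "TT": ((), "boolPair"), "TF": ((), "boolPair"),
    "FT": ((), "boolPair"), "FF": ((), "boolPair"),
    "cycle": (("boolPair",), "boolPair"),
}
GAMMA = {"x": "boolPair"}
```

For instance, cycle(cycle(x)) is a term of sort boolPair, whereas cycle(y) is rejected because y has no sort under GAMMA.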

To define the algebraic terms of a signature Σ in HOL, we first introduce a new base
type variable to represent our variables. We then represent a sort assignment as a set of
pairs (x, s) : variable × sort, where x is a variable and s ∈ S is its corresponding sort. The
HOL predicate gammaRes identifies those sets that are valid sort assignments under the
signature Σ = (S, Ω).

gammaRes
  ⊢def ∀sigma gamma.
         gammaRes sigma gamma =
           (∀(v,s) :: (inSet gamma). inSorts sigma s)

Note that this implementation does not need to verify that each sort assignment Γ is consistent,
because each variable v will always appear along with its sort in a pair (v, s). Intuitively,
each pair (v, s) can be viewed as representing a variable v_s of sort s, and hence (for example)
the pairs (x, int) and (x, bool) represent distinct variables x_int and x_bool.
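
As a rough illustration of this representation, the Python sketch below models a sort assignment as a set of (variable, sort) pairs. The function name `gamma_res`, the sort set `SORTS`, and the simplification of passing only the sort set rather than the whole signature are assumptions made for the example.

```python
from typing import FrozenSet, Set, Tuple

Var = Tuple[str, str]  # a pair (variable name, sort), as in the text


def gamma_res(sorts: Set[str], gamma: FrozenSet[Var]) -> bool:
    """Analogue of gammaRes: every pair in gamma mentions a sort of the signature."""
    return all(s in sorts for (_v, s) in gamma)


SORTS = {"int", "bool"}  # hypothetical sort set S
# (x, int) and (x, bool) are two distinct elements, i.e. two distinct variables.
gamma = frozenset({("x", "int"), ("x", "bool")})
```

Because the sort is part of the variable's identity, no separate consistency check appears in `gamma_res`.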

We first define a recursive HOL type ppreT, which provides an abstract-syntax representation
for Σ-terms and their associated types. Strictly speaking, this new type is slightly
more general than our desired Σ-terms, as it also contains items that technically are only
portions of valid Σ-terms. For example, if an operator p has type (s_1 × s_2 × ... × s_n) → s
and t_1 is a term of type s_1, then p t_1 is not itself a Σ-term. However, it is convenient to
allow such partial terms in our abstract syntax, and we can easily pick out the
valid terms from the set of ppreT values.

The  recursive  type  ppreT  has  the  following  three  constructors: 

Leafv : (variable × sort) → ppreT
Leafo : (operator × sort list × sort) → ppreT
Comb  : ppreT → ppreT → ppreT

A value of form Leafv (x, s) represents a variable x of sort s. A value of form
Leafo (op, [s_1, s_2, ..., s_n], s) represents an operator op of type (s_1 × s_2 × ... × s_n) → s.
Finally, a value of form Comb t_1 t_2 corresponds to a partially instantiated term.

We  define  a  HOL  predicate  isPreSigmaTerm  that  determines  the  appropriate  type  for 
each  element  of  ppreT  as  follows  (HD  and  TL  return  the  head  and  tail  of  a  list): 


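   inSet gamma (x,s)      gammaRes sigma gamma
  ------------------------------------------------------
   isPreSigmaTerm sigma gamma ([ ],s) (Leafv (x,s))


   inOmega sigma (op,sl,s)      gammaRes sigma gamma
  ------------------------------------------------------
   isPreSigmaTerm sigma gamma (sl,s) (Leafo (op,sl,s))


   isPreSigmaTerm sigma gamma (l1,s1) t1
   isPreSigmaTerm sigma gamma ([ ],s2) t2      gammaRes sigma gamma
  ------------------------------------------------------------------  s2 = HD l1
   isPreSigmaTerm sigma gamma (TL l1,s1) (Comb t1 t2)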
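
The three rules can be read as a recursive function that computes the pair (remaining argument sorts, result sort) of a pre-term. The Python sketch below is one possible reading under stated assumptions: the dataclass encoding, the names `pre_type` and `is_sigma_term`, and the decision to pass Ω and Γ directly as sets are all hypothetical, and the gammaRes premise is omitted for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple, Union

Sort = str
Op = Tuple[str, Tuple[Sort, ...], Sort]  # (operator, argument sorts, result sort)


@dataclass(frozen=True)
class Leafv:
    var: Tuple[str, Sort]


@dataclass(frozen=True)
class Leafo:
    op: Op


@dataclass(frozen=True)
class Comb:
    fun: "PPreT"
    arg: "PPreT"


PPreT = Union[Leafv, Leafo, Comb]


def pre_type(omega: FrozenSet[Op], gamma: FrozenSet[Tuple[str, Sort]],
             t: PPreT) -> Optional[Tuple[Tuple[Sort, ...], Sort]]:
    """Return (remaining argument sorts, result sort), or None if ill-formed."""
    if isinstance(t, Leafv):
        return ((), t.var[1]) if t.var in gamma else None
    if isinstance(t, Leafo):
        return (t.op[1], t.op[2]) if t.op in omega else None
    left = pre_type(omega, gamma, t.fun)
    right = pre_type(omega, gamma, t.arg)
    if left is None or right is None:
        return None
    (pending, result), (arg_pending, arg_sort) = left, right
    # The argument must be a complete term whose sort heads the pending list.
    if arg_pending == () and pending and pending[0] == arg_sort:
        return (pending[1:], result)
    return None


def is_sigma_term(omega, gamma, t: PPreT) -> bool:
    """A pre-term is a full term exactly when no argument sorts remain."""
    typ = pre_type(omega, gamma, t)
    return typ is not None and typ[0] == ()
```

For a binary operator, applying `Comb` once yields a pre-term with one pending argument sort, and applying it a second time yields a complete term.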

A Σ-term of type s under Γ is simply an element of ppreT whose associated type is ([ ], s).
In HOL, these terms can be represented as four-tuples (Σ, Γ, t, s) : sig × (variable × sort) set ×
ppreT × sort. The predicate isSigmaTerm identifies those four-tuples that are valid
representations of Σ-terms. Using HOL's type-definition construct, we can also introduce a new
HOL type sigmaTm.


sigmaTm_TY_DEF
  [oracles: #] [axioms: ] []
  ⊢def ∃rep. TYPE_DEFINITION isSigmaTerm rep
sigmaTm_ISO_DEF
  [oracles: #] [axioms: ] []
  ⊢def (∀a. ABS_sigmaTm (REP_sigmaTm a) = a) ∧
       (∀r. isSigmaTerm r = REP_sigmaTm (ABS_sigmaTm r) = r)

We define accessor functions sigSigmaTm, gammaSigmaTm, tmSigmaTm, and tySigmaTm
that extract the signature, the variable assignment, the term itself, and its type from a
Σ-term.

sigSigmaTm    ⊢def ∀x. sigSigmaTm x = FST (REP_sigmaTm x)
gammaSigmaTm  ⊢def ∀x. gammaSigmaTm x = FST (SND (REP_sigmaTm x))
tmSigmaTm     ⊢def ∀x. tmSigmaTm x = FST (SND (SND (REP_sigmaTm x)))
tySigmaTm     ⊢def ∀x. tySigmaTm x = SND (SND (SND (REP_sigmaTm x)))

A Σ-equation is a pair of Σ-terms (t_1, t_2) : sigmaTm × sigmaTm where t_1 and t_2 are of
the same type. The predicate isSigmaEq identifies those pairs of Σ-terms that satisfy this
constraint, and we use this predicate to define a HOL type sigmaEq for Σ-equations.


isSigmaEq
  ⊢def ∀t1 t2.
         isSigmaEq (t1,t2) =
           (sigSigmaTm t1 = sigSigmaTm t2) ∧
           (gammaSigmaTm t1 = gammaSigmaTm t2) ∧
           (tySigmaTm t1 = tySigmaTm t2)
sigmaEq_TY_DEF  ⊢def ∃rep. TYPE_DEFINITION isSigmaEq rep
sigmaEq_ISO_DEF
  ⊢def (∀a. ABS_sigmaEq (REP_sigmaEq a) = a) ∧
       (∀r. isSigmaEq r = REP_sigmaEq (ABS_sigmaEq r) = r)


We introduce accessor functions leftTermEq and rightTermEq that extract the left and
right terms from a Σ-equation. Likewise, we introduce functions sigSigmaEq, gammaSigmaEq,
ltmSigmaEq, rtmSigmaEq, and tySigmaEq that obtain the signature, the variable assignment,
the left term, the right term, and the terms' type from an arbitrary Σ-equation.

leftTermEq    ⊢def ∀x. leftTermEq x = FST (REP_sigmaEq x)
rightTermEq   ⊢def ∀x. rightTermEq x = SND (REP_sigmaEq x)
sigSigmaEq    ⊢def ∀x. sigSigmaEq x = sigSigmaTm (leftTermEq x)
gammaSigmaEq  ⊢def ∀x. gammaSigmaEq x = gammaSigmaTm (leftTermEq x)
ltmSigmaEq    ⊢def ∀x. ltmSigmaEq x = tmSigmaTm (leftTermEq x)
rtmSigmaEq    ⊢def ∀x. rtmSigmaEq x = tmSigmaTm (rightTermEq x)
tySigmaEq     ⊢def ∀x. tySigmaEq x = tySigmaTm (leftTermEq x)
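
The pattern used here (a predicate carving a new type out of an existing one, followed by accessors defined through the representation) can be imitated loosely in Python. In the sketch below the names `SigmaTm`, `is_sigma_eq`, and `SigmaEq` are hypothetical, terms are abbreviated to strings, and the constructor raises an exception on an invalid pair, which is an assumption of the sketch and differs from HOL, where the abstraction function is simply left unspecified outside the predicate.

```python
from typing import FrozenSet, NamedTuple, Tuple


class SigmaTm(NamedTuple):
    """Stand-in for a validated term: (signature, sort assignment, term, sort)."""
    sig: str
    gamma: FrozenSet[Tuple[str, str]]
    tm: str
    ty: str


def is_sigma_eq(pair: Tuple[SigmaTm, SigmaTm]) -> bool:
    """Both terms share the signature, the sort assignment, and the sort."""
    t1, t2 = pair
    return t1.sig == t2.sig and t1.gamma == t2.gamma and t1.ty == t2.ty


class SigmaEq:
    """Values exist only for pairs that satisfy is_sigma_eq."""

    def __init__(self, pair: Tuple[SigmaTm, SigmaTm]):
        if not is_sigma_eq(pair):
            raise ValueError("terms differ in signature, assignment, or sort")
        self._rep = pair

    def left(self) -> SigmaTm:
        return self._rep[0]

    def right(self) -> SigmaTm:
        return self._rep[1]

    def ty(self) -> str:
        # As in tySigmaEq, the sort is read off the left term.
        return self.left().ty
```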


5  Algebraic  Specifications 

As  we  have  discussed  in  previous  sections,  every  signature  has  a  collection  of  models,  not 
all  of  which  necessarily  capture  our  intentions.  To  reduce  this  collection  to  those  models 
that  do  capture  the  intended  meaning,  it  is  necessary  to  add  constraints  to  the  signatures 
that  limit  the  potential  models.  These  constraints  can  be  represented  as  equations  that  are 
added  to  a  given  signature;  the  result  is  an  algebraic  specification. 

An algebraic specification is a pair (Σ, E), where Σ is a signature and E is a set of
Σ-formulas that serve as the axioms of the specification. Algebraic specifications can be used to
specify computer systems, where Σ describes the interface of the system and E states the desired
system properties.

For example, we can define TL-spec = (Σ_tl, E_tl) as a specification for the traffic light
and BP-spec = (Σ_bp, E_bp) as a specification for the boolean pair, where Σ_tl and Σ_bp are the
signatures described on page 6. With E_tl, the specification TL-spec states that a traffic light
has exactly one of the three distinct colors. With E_bp, the specification BP-spec states that
a boolean pair has exactly one of the four distinct values.


E_tl:

  color-distinct: (green ≠ yellow) ∧ (yellow ≠ red) ∧ (red ≠ green)
  color-cases:    for each color x, (x = green) ∨ (x = yellow) ∨ (x = red)

E_bp:

  boolPair-distinct:
    (TT ≠ TF) ∧ (TT ≠ FT) ∧ (TT ≠ FF) ∧ (TF ≠ FT) ∧ (TF ≠ FF) ∧ (FT ≠ FF)
  boolPair-cases: for each boolPair y, (y = TT) ∨ (y = TF) ∨ (y = FT) ∨ (y = FF)

A Σ-algebra (A, I) is a model of a specification (Σ, E) provided that (A, I) satisfies all
the formulas in E. That is, for every possible variable assignment, all the Σ-formulas in E
hold with respect to the carrier-set assignment A and the function-symbol assignments I.
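
For a finite carrier set, the model condition can be checked by enumeration. The Python sketch below does this for the two axioms of E_tl; it assumes that the traffic-light signature has a single sort and three constants named green, yellow, and red, and the function name `is_tl_model` is hypothetical.

```python
from typing import Dict, Hashable, Set


def is_tl_model(carrier: Set[Hashable], interp: Dict[str, Hashable]) -> bool:
    """Check color-distinct and color-cases for a candidate algebra (A, I)."""
    g, y, r = interp["green"], interp["yellow"], interp["red"]
    if not {g, y, r} <= carrier:
        return False  # constants must denote elements of the carrier set
    distinct = g != y and y != r and r != g
    # color-cases must hold for every assignment of the variable x.
    cases = all(x in (g, y, r) for x in carrier)
    return distinct and cases
```

A three-element carrier with distinct interpretations passes both checks; a carrier with a fourth element violates color-cases, and an interpretation that identifies two constants violates color-distinct.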


5.1  Specification  as  a  HOL  Type 

In HOL, we represent specifications by triples (Σ, Γ, E) of type sig × (variable × sort) set ×
sigmaEq set, so that Σ is a signature, Γ is a variable assignment, and E is a set of
Σ-equations.

The predicate isSpec identifies the subset of these triples that correspond to valid
algebraic specifications.


isSpec
  ⊢def ∀sigma gamma E.
         isSpec (sigma,gamma,E) =
           (∀sigEq :: (inSet E).
              (sigSigmaEq sigEq = sigma) ∧ (gammaSigmaEq sigEq = gamma))
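
In other words, a triple is a specification when every equation in E was built over the triple's own signature and variable assignment. A minimal Python sketch of that check follows; the record type `Eq`, the use of a string as a stand-in for the signature, and the name `is_spec` are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

Gamma = FrozenSet[Tuple[str, str]]


class Eq(NamedTuple):
    sig: str      # a name standing in for the signature
    gamma: Gamma  # the sort assignment the equation was built under
    lhs: str
    rhs: str


def is_spec(sigma: str, gamma: Gamma, eqs: Iterable[Eq]) -> bool:
    """Analogue of isSpec: all equations agree with sigma and gamma."""
    return all(eq.sig == sigma and eq.gamma == gamma for eq in eqs)
```

Note that a triple with an empty equation set satisfies the predicate trivially.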

Using  this  predicate  along  with  HOL’s  type-definition  construct,  we  can  then  define 
algebraic  specifications  as  a  new  HOL  type  spec. 

spec_TY_DEF  ⊢def ∃rep. TYPE_DEFINITION isSpec rep
spec_ISO_DEF
  ⊢def (∀a. ABS_spec (REP_spec a) = a) ∧
       (∀r. isSpec r = REP_spec (ABS_spec r) = r)

As before, we also define accessor functions sigSpec, gammaSpec, and ESpec that extract
the signature, the variable assignment, and the set of Σ-equations from a specification.


sigSpec    ⊢def ∀x. sigSpec x = FST (REP_spec x)
gammaSpec  ⊢def ∀x. gammaSpec x = FST (SND (REP_spec x))
ESpec      ⊢def ∀x. ESpec x = SND (SND (REP_spec x))


5.2 Algebraic Specifications as a Category

The category Spec is the category of algebraic specifications: its objects are specifications,
and its arrows are specification morphisms. A specification morphism f between two
specifications (Σ, E) and (Σ', E') is a signature morphism between Σ and Σ' that preserves
theorems: f takes any axiom (i.e., Σ-formula) in E either to an axiom in E' or to a theorem
deducible from the axioms in E'.

We represent a specification morphism in HOL as a triple of functions f_s : S_1 → S_2,
f_o : Ω_1 → Ω_2, and f_v : Γ_1 → Γ_2. The pair (f_s, f_o) is a signature morphism, and f_v provides
a mapping between the two variable assignments of the specifications. This mapping f_v is
necessary at this stage to avoid the complex α-conversion of terms.

An arrow in the category Spec is therefore represented as a triple (d, (f_s, f_o, f_v), c) with
the following type:

spec ×
((sort → sort) ×
 ((operator × sort list × sort) → (operator × sort list × sort)) ×
 ((variable × sort) → (variable × sort))) ×
spec

As  before,  we  define  accessor  functions  that  retrieve  the  various  components  of  a  specification 
arrow: 

specMFs_DEF  ⊢def ∀d fs fo fv c. specMFs (d,(fs,fo,fv),c) = fs
specMFo_DEF  ⊢def ∀d fs fo fv c. specMFo (d,(fs,fo,fv),c) = fo
specMFv_DEF  ⊢def ∀d fs fo fv c. specMFv (d,(fs,fo,fv),c) = fv

The signature arrow (f_s, f_o) : (S, Ω) → (S', Ω') and the variable-assignment mapping
f_v : Γ → Γ' determine the transformation from the Σ-equations of one specification to the
Σ'-equations of another specification. The terms of Σ (under Γ) can be transformed to terms
of Σ' (under Γ') in the obvious inductive fashion:

• Each (sub)term of form (Leafv (x, s)) is transformed to the term Leafv (fv (x, s)).

• Each (sub)term of form (Leafo (op, sl, s)) is transformed to the term Leafo (fo (op, sl, s)).

• Each (sub)term of form (Comb t1 t2) is transformed by transforming its components
t1 and t2.

The recursive function transPPT is formalized in HOL as follows:

transPPT
  [oracles: #] [axioms: ] []
  ⊢def (∀fo fv x. transPPT fo fv (Leafv x) = Leafv (fv x)) ∧
       (∀fo fv op. transPPT fo fv (Leafo op) = Leafo (fo op)) ∧
       (∀fo fv ppt2 ppt1.
          transPPT fo fv (Comb ppt1 ppt2) =
          Comb (transPPT fo fv ppt1) (transPPT fo fv ppt2))
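
The same structural recursion can be sketched in Python. The dataclass encoding of ppreT and the name `trans_ppt` are assumptions made for this illustration; only the shape of the recursion is taken from the definition above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union

Sort = str
Var = Tuple[str, Sort]
Op = Tuple[str, Tuple[Sort, ...], Sort]


@dataclass(frozen=True)
class Leafv:
    var: Var


@dataclass(frozen=True)
class Leafo:
    op: Op


@dataclass(frozen=True)
class Comb:
    fun: "PPreT"
    arg: "PPreT"


PPreT = Union[Leafv, Leafo, Comb]


def trans_ppt(fo: Callable[[Op], Op], fv: Callable[[Var], Var], t: PPreT) -> PPreT:
    """Apply fv at variable leaves and fo at operator leaves; recurse through Comb."""
    if isinstance(t, Leafv):
        return Leafv(fv(t.var))
    if isinstance(t, Leafo):
        return Leafo(fo(t.op))
    return Comb(trans_ppt(fo, fv, t.fun), trans_ppt(fo, fv, t.arg))
```

The translation changes only the leaves, so the tree shape of a term is preserved.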
The Σ'-types of the newly constructed terms are obtained by applying the mapping fs
to the original Σ-type of a term, so that each Σ-term t of type s is transformed to a term t'
with type (fs s):

transTerm
  ⊢def ∀fs fo fv sigma2 gamma2 t.
         transTerm fs fo fv sigma2 gamma2 t =
           ABS_sigmaTm
             (sigma2,gamma2,transPPT fo fv (tmSigmaTm t),fs (tySigmaTm t))

Finally, these transformations can be applied componentwise to transform a Σ-equation
to a corresponding Σ'-equation, as follows:

transEq
  ⊢def ∀fs fo fv sigma2 gamma2 eq.
         transEq fs fo fv sigma2 gamma2 eq =
           ABS_sigmaEq
             (ABS_sigmaTm
                (sigma2,
                 gamma2,
                 transPPT fo fv (ltmSigmaEq eq),
                 fs (tySigmaEq eq)),
              ABS_sigmaTm
                (sigma2,
                 gamma2,
                 transPPT fo fv (rtmSigmaEq eq),
                 fs (tySigmaEq eq)))
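
To show how the sort mapping enters at the equation level, here is a compact Python sketch. It uses a hypothetical tagged-tuple encoding (("v", var), ("o", op), ("c", t1, t2)) for pre-terms and a five-tuple (signature, assignment, left, right, sort) for equations; these encodings and the names `trans_ppt` and `trans_eq` are assumptions of the sketch. As in the HOL definition, the result is assembled directly, without checking that it is well formed over the target signature.

```python
def trans_ppt(fo, fv, t):
    """Translate a tagged-tuple pre-term leaf by leaf."""
    tag = t[0]
    if tag == "v":
        return ("v", fv(t[1]))
    if tag == "o":
        return ("o", fo(t[1]))
    return ("c", trans_ppt(fo, fv, t[1]), trans_ppt(fo, fv, t[2]))


def trans_eq(fs, fo, fv, sigma2, gamma2, eq):
    """Translate both sides with fo and fv, and the common sort with fs."""
    _sigma, _gamma, lhs, rhs, sort = eq
    return (sigma2, gamma2,
            trans_ppt(fo, fv, lhs),
            trans_ppt(fo, fv, rhs),
            fs(sort))
```

Both sides receive the same translated sort, so an equation between terms of sort s becomes an equation between terms of sort fs(s).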

Because specification morphisms must preserve theorems, we need a way to verify that
each Σ-equation is translated to an equation derivable from the Σ'-equations. Ideally, we
would define a function thmsDerivable : sigmaEq set → sigmaEq set that constructs the
set of all theorems derivable from a given set of axioms. We could then define a specification
arrow in HOL as follows, where the predicate inGammaSpec defines the elements that are
in the variable-assignment set Γ of a specification and the predicate specA picks out the
specification arrows from pre-arrow triples:

inGammaSpec  ⊢def ∀sp. inGammaSpec sp = inSet (gammaSpec sp)

specA_DEF
  ⊢def specA =
       (let SA (d,(fs,fo,fv),c) =
              sigA (sigSpec d,(fs,fo),sigSpec c) ∧
              isRestr (inGammaSpec d) fv ∧
              (∀(x1,s1) :: (inGammaSpec d).
                 let (x2,s2) = fv (x1,s1)
                 in
                   inGammaSpec c (x2,s2) ∧ (s2 = fs s1)) ∧
              (∀eq :: (inSet (ESpec d)).
                 inSet (thmsDerivable (ESpec c))
                   (transEq fs fo fv (sigSpec c) (gammaSpec c) eq))
        in
          SA)

This approach is clearly not feasible, as it is impossible to construct the set that contains
all the theorems derivable from a given set of axioms. In fact, it is in general undecidable
whether a given formula is a consequence of a collection of axioms.

Based on our limited experience, however, we believe that in practice a user is capable
of ensuring that the mapping preserves theorems. The specification morphisms used
to refine specifications in practice tend to introduce new constraints without significant
renaming or significant omissions (i.e., axioms in E tend to be translated to axioms in E').
Furthermore, for more complicated translations, a user could prove the necessary preservations
separately using the HOL theorem prover. Having verified the necessary conditions for the
given specifications, the user could then introduce a HOL definition that provides a sufficient
approximation to the set thmsDerivable (ESpec c), namely a set that contains precisely the
(finite number of) verified axiom translations. Although such a process is not completely
automated, it does allow a user to verify the validity of the translation and to generate
assured specifications and refinements.
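
The finite approximation described above amounts to a membership test against a user-supplied set of already verified theorems. The Python sketch below illustrates the idea with equations abbreviated to strings; the names `preserves_axioms` and `missing_obligations` are hypothetical, and the sketch takes the verified set on trust rather than checking derivability itself.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def preserves_axioms(axioms_d: Iterable[Hashable],
                     translate: Callable[[Hashable], Hashable],
                     verified_c: FrozenSet[Hashable]) -> bool:
    """Every translated axiom of d lies in the finite, user-verified set for c."""
    return all(translate(eq) in verified_c for eq in axioms_d)


def missing_obligations(axioms_d: Iterable[Hashable],
                        translate: Callable[[Hashable], Hashable],
                        verified_c: FrozenSet[Hashable]) -> List[Hashable]:
    """Translated axioms that still need a separate proof."""
    return [translate(eq) for eq in axioms_d if translate(eq) not in verified_c]
```

A non-empty result from `missing_obligations` lists exactly the proof obligations that would have to be discharged separately before the morphism could be accepted.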

Once specification arrows have been defined, we introduce a function sigASpecA to
extract the signature arrow from a specification arrow.

sigASpecA_DEF
  ⊢def sigASpecA =
       (let sAsA (d,(fs,fo,fv),c) = sigSpec d,(fs,fo),sigSpec c in sAsA)

The identity arrow of the object (Σ, E) in the category Spec is the identity arrow of the
object Σ in the category Sig, and composition in Spec is defined in the same way as in the
category Sig:

specId_DEF
  ⊢def ∀sp.
         specId sp =
           (let (fs,fo) = mid (sigId (sigSpec sp))
            and fv = (λx :: (inGammaSpec sp). x)
            in
              sp,(fs,fo,fv),sp)
specOo_DEF
  ⊢def specOo =
       (λm.
          λn :: (cpsl specA m).
            let (fs,fo) = mid (sigOo (sigASpecA m) (sigASpecA n))
            and fv =
              (λx :: (inGammaSpec (dom n)). (specMFv m o specMFv n) x)
            in
              dom n,(fs,fo,fv),cod m)

specCat_DEF  ⊢def specCat = ((λx. T),specA,specId,specOo)
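
The identity and composition above follow the usual pattern for arrows represented as (domain, maps, codomain) triples. The Python sketch below mirrors that pattern under simplifying assumptions: specifications are opaque values compared with ==, the restriction of the maps to the domain's elements is omitted, and the names `spec_id` and `spec_compose` are hypothetical.

```python
def spec_id(sp):
    """Identity arrow on sp: all three component maps are the identity."""
    identity = lambda x: x
    return (sp, (identity, identity, identity), sp)


def spec_compose(m, n):
    """Composite 'm after n'; defined only when cod n equals dom m."""
    d_m, (fs_m, fo_m, fv_m), c_m = m
    d_n, (fs_n, fo_n, fv_n), c_n = n
    if c_n != d_m:
        raise ValueError("arrows are not composable")
    fs = lambda s: fs_m(fs_n(s))
    fo = lambda op: fo_m(fo_n(op))
    fv = lambda v: fv_m(fv_n(v))
    return (d_n, (fs, fo, fv), c_m)
```

As in specOo, the composite runs from the domain of n to the codomain of m, and each component map applies n's map first.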

Finally, the tuple specCat could be proved to represent a category by proving the following
goal.

val goal = ([], --`isCat specCat`--);


6 Summary and Future Work

Throughout  this  report,  we  have  identified  many  of  the  categories  and  constructs  underlying 
algebraic  specifications  and  their  interpretations.  Furthermore,  we  have  formulated  them  in 
higher-order  logic  and  (in  most  cases)  verified  the  correctness  of  our  formulation. 

The purpose of computer-assisted reasoning is to help nonexperts in a given
domain nonetheless have confidence in their analysis. In this work, we have not uncovered
new uses of category theory or proved new theorems about category theory. Instead, we
have embedded category theory in a form that nonexperts can use in the future to construct
assured specifications and ultimately assured code. The objective of our formulation of
category theory in HOL is to fully explicate the underlying principles of construction that
algebraic specifications provide.

At present, this work remains incomplete. As discussed in Section 5, verifying that a
specification morphism is valid introduces additional proof obligations for the user. We have
not yet investigated the various mechanisms for integrating these obligations into the system.
We would like to better understand the trade-offs involved and how well these mechanisms
work in practice.

In addition, we have not yet implemented the categorical mechanisms that underlie the
composition of specifications or refinements of specifications. This type of composition and
the notions of refinement rely on categorical pushouts (or, more generally, colimits), which
provide a canonical way to compose specifications. Intuitively, a specification morphism f
from A to B indicates how B can be viewed as adding additional constraints to A. The
existence of pushouts in the category Spec assures that whenever a specification A can be
further constrained by two different specifications B and C (via morphisms f and g),
there is a canonical specification D that captures precisely the additional constraints imposed
by both B and C. Given such specification morphisms f : A → B and g : A → C, the
pushout can be constructed algorithmically, and hence we do not anticipate any significant
difficulties formulating pushouts in HOL.
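
To give a flavor of such an algorithmic construction, the Python sketch below computes a pushout of finite sets, which is the kind of gluing one would expect to apply to the sort and operator symbols of the specifications involved. It is not a pushout in Spec: equations and sort-compatibility are ignored, and the function name `pushout`, the tagging scheme, and the equivalence-class representation of the result are all assumptions of the sketch.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

Tagged = Tuple[str, Hashable]


def pushout(a: Iterable[Hashable], b: Iterable[Hashable], c: Iterable[Hashable],
            f: Callable, g: Callable):
    """Pushout of finite sets B <-f- A -g-> C: (B + C) modulo f(x) ~ g(x)."""
    parent: Dict[Tagged, Tagged] = {("B", y): ("B", y) for y in b}
    parent.update({("C", z): ("C", z) for z in c})

    def find(u: Tagged) -> Tagged:
        while parent[u] != u:
            u = parent[u]
        return u

    for x in a:  # glue the two images of each element of A
        ru, rv = find(("B", f(x))), find(("C", g(x)))
        if ru != rv:
            parent[ru] = rv

    classes: Dict[Tagged, set] = {}
    for u in parent:
        classes.setdefault(find(u), set()).add(u)
    block_of = {u: frozenset(classes[find(u)]) for u in parent}
    from_b = lambda y: block_of[("B", y)]
    from_c = lambda z: block_of[("C", z)]
    return set(block_of.values()), from_b, from_c
```

Elements of B and C that come from the same element of A end up in one class of D, while all other elements remain separate, so the resulting square commutes by construction.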